Doc validation harness: test the toggle + statusline instructions against the real binary #2

Open: Tombar wants to merge 18 commits into main from docs-validation-harness

Tombar commented Jun 2, 2026

Summary

Adds a committed bats-core harness (test/docs/) that validates the copy-paste
instructions in docs/toggle/{wezterm,iterm,tmux}.md and docs/claude-statusline.md by
extracting the literal fenced code blocks from the docs and running them against the
installed failsafe binary in a sandboxed HOME. Wired into a separate CI workflow so
doc drift is caught on every push.

The harness already paid for itself: it caught a real bug in docs/toggle/iterm.md
(an unused app = await iterm2.async_get_app(connection) line, flagged by pyflakes) —
fixed here.

What's covered (34 headless tests, green in CI)

  • cross-cutting: mode-source chain order, missing-file⇒read, rego matches "read", rw/ro alias normalization, tab-delimited mode get.
  • claude-statusline.sh: 🔒 read/🔓 write glyphs, jq enrichment, graceful degrade without jq (proven: no cwd appended), single-line output.
  • tmux (live headless tmux): toggle script flips the file, #{pane_id}==$TMUX_PANE, C-M-t registration, status-bar colors, no-script failsafe toggle.
  • wezterm: the doc's own Lua toggle_mode executed via a wezterm stub + driver and cross-checked against failsafe mode get; luacheck (runs in CI on lua5.4); sudo-timeout revert.
  • iterm: OSC-1337 base64 roundtrip, the doc's own read_mode exec'd from its AST, py_compile + import iterm2, pyflakes, no-python toggle.
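As a rough illustration of the "read_mode exec'd from its AST" idea in the iterm bullet, the sketch below compiles a single function out of a snippet without running the rest of it, so a top-level import iterm2 never executes. It is a Python sketch under assumptions, not the harness's own code: load_function and SNIPPET are hypothetical names, and the read_mode body is a stand-in that only mirrors the missing-file⇒read behavior listed above.

```python
import ast
import os
import tempfile

# Hypothetical stand-in for the doc's snippet; the real read_mode may differ.
SNIPPET = '''
import iterm2  # top-level import that load_function never executes

def read_mode(path):
    # A missing or empty mode file is treated as "read".
    try:
        with open(path) as fh:
            return fh.read().strip() or "read"
    except FileNotFoundError:
        return "read"
'''


def load_function(source, name):
    """Compile only the top-level function `name` from `source` and return it."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            # Build a module containing just this one definition.
            module = ast.Module(body=[node], type_ignores=[])
            namespace = {}
            exec(compile(module, "<doc-snippet>", "exec"), namespace)
            return namespace[name]
    raise LookupError(f"no top-level function named {name!r}")


read_mode = load_function(SNIPPET, "read_mode")

# Exercise the extracted function inside a throwaway directory.
with tempfile.TemporaryDirectory() as sandbox:
    mode_file = os.path.join(sandbox, "mode")
    missing_result = read_mode(mode_file)  # file does not exist yet
    with open(mode_file, "w") as fh:
        fh.write("write\n")
    written_result = read_mode(mode_file)
```

Isolating the function this way lets a test check its behavior on a machine where the iterm2 package is absent, while a separate py_compile or pyflakes pass can still cover the full snippet.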

How it works

  • lib/extract.sh pulls the exact Nth fenced block of a language from a .md (heading-anchored), so tests run the doc's bytes, not copies — editing a snippet that breaks its contract fails CI.
  • helpers.bash sandboxes HOME per test (mktemp -d), so toggling never touches the real ~/.claude; teardown only removes its own temp dirs.
  • Missing tools skip with a reason (never a vacuous pass); CI installs everything, so a skip in CI is itself a signal.
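The heading-anchored extraction described in the first bullet can be sketched as follows. This is a Python illustration of the technique only, not a port of lib/extract.sh: the function name extract_block, its parameters, and the rule that a heading's scope ends at the next heading of the same or a higher level are all assumptions made for the example.

```python
from typing import Optional

# Built indirectly so the sketch can itself sit inside a fenced block.
FENCE = "`" * 3


def extract_block(markdown: str, lang: str, n: int = 1,
                  heading: Optional[str] = None) -> str:
    """Return the body of the n-th (1-based) fenced block tagged `lang`.

    With `heading`, only blocks under the first heading whose text equals
    `heading` are counted, up to the next heading of the same or higher level.
    """
    state = "inside" if heading is None else "before"
    scope_level = 0
    in_block = False
    wanted = False
    seen = 0
    body = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if in_block:
            if stripped == FENCE:  # closing fence
                if wanted:
                    return "\n".join(body)
                in_block = False
            elif wanted:
                body.append(line)  # keep the doc's bytes untouched
            continue
        if stripped.startswith(FENCE):  # opening fence
            in_block = True
            wanted = False
            if state == "inside" and stripped[len(FENCE):].strip() == lang:
                seen += 1
                wanted = seen == n
                body = []
            continue
        if heading is not None and stripped.startswith("#"):
            level = len(stripped) - len(stripped.lstrip("#"))
            title = stripped.lstrip("#").strip()
            if state == "before" and title == heading:
                state, scope_level = "inside", level
            elif state == "inside" and level <= scope_level:
                state = "after"
    raise LookupError(f"block {n} of language {lang!r} not found")


# Small sample document; the commands are illustrative placeholders.
doc = "\n".join([
    "# Setup",
    FENCE + "sh",
    "echo setup",
    FENCE,
    "## Toggle",
    FENCE + "lua",
    "return 1",
    FENCE,
    FENCE + "sh",
    "# a shell comment, not a heading",
    "failsafe mode get",
    FENCE,
])
toggle_sh = extract_block(doc, "sh", 1, heading="Toggle")
```

Because the test consumes whatever this returns, any edit to the fenced block in the doc changes what the test executes, which is the property the harness relies on.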

Not automated (honestly labeled)

GUI-only surfaces (WezTerm toasts and format-tab-title, iTerm2's Python-runtime keybinding) are labeled STATIC or LIVE-MANUAL rather than tested automatically. A local-only make validate-docs-live launches real WezTerm (wezterm show-keys) to confirm the snippet loads and binds Ctrl+Alt+t. See test/docs/REPORT.md for the per-claim results.

Test Plan

  • make validate-docs — 34/34 pass locally
  • go test ./... — unaffected, all green
  • CI doc-validation workflow green on this branch
  • make validate-docs-live — real WezTerm config-load passes
  • existing ci.yml untouched

🤖 Generated with Claude Code

Tombar and others added 18 commits June 2, 2026 10:12
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s assert)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ut never used)

Found by the doc-validation harness (test/docs/iterm.bats pyflakes check).
The iTerm2 toggle registers its RPC off `connection`; `app` was dead code.
… distros

- tmux list-keys: accept C-M-t or M-C-t (older tmux reorders modifiers);
  assert the stable toggle-path + #{pane_id} first.
- luacheck: --no-color so ANSI escapes don't split the '0 errors' substring.
- bats --print-output-on-failure for diagnosability.
…ngs REPORT

REPORT.md records per-claim PASS/STATIC/LIVE-MANUAL results and the one doc bug
the harness caught and fixed (iterm.md unused async_get_app).